Bayesian modeling of spatially differentiated multivariate enamel defects of the children’s primary maxillary central incisor teeth

Background The analysis of dental caries has been a major focus of recent work on modeling dental defect data. While a dental caries focus is of major importance in dental research, the examination of developmental defects which could also contribute at an early stage of dental caries formation, is also of potential interest. This paper proposes a set of methods which address the appearance of different combinations of defects across different tooth regions. In our modeling we assess the linkages between tooth region development and both the type of defect and associations with etiological predictors of the defects which could be influential at different times during the tooth crown development. Methods We develop different hierarchical model formulations under the Bayesian paradigm to assess exposures during primary central incisor (PMCI) tooth development and PMCI defects. We evaluate the Bayesian hierarchical models under various simulation scenarios to compare their performance with both simulated dental defect data and real data from a motivating application. Results The proposed model provides inference on identifying a subset of etiological predictors of an individual defect accounting for the correlation between tooth regions and on identifying a subset of etiological predictors for the joint effect of defects. Furthermore, the model provides inference on the correlation between the regions of the teeth as well as between the joint effect of the developmental enamel defects and dental caries. Simulation results show that the proposed model consistently yields steady inferences in identifying etiological biomarkers associated with the outcome of localized developmental enamel defects and dental caries under varying simulation scenarios as deemed by small mean square error (MSE) when comparing the simulation results to real application results. Conclusion We evaluate the proposed model under varying simulation scenarios to develop a model for multivariate dental defects and dental caries assuming a flexible covariance structure that can handle regional and joint effects. The proposed model shed new light on methods for capturing inclusive predictors in different multivariate joint models under the same covariance structure and provides a natural extension to a nested hierarchical model. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-024-02211-8.


Background
The analysis of dental caries or tooth decay has been a major focus of recent work on modeling dental data [1][2][3].While a dental caries focus is of major importance in dental research, the examination of enamel quantitative and qualitative defects that could also contribute at an early stage to caries formation is also of potential interest.
In what follows we propose a set of methods that address different combinations of localized defects across different tooth regions of a child's two front teeth, the primary maxillary central incisor teeth (PMCI), also referred to as teeth E and F. These PMCI teeth begin enamel calcification at approximately 14 gestational weeks [4].PMCI teeth are fully calcified on average slightly over onemonth postnatal [5,6] with eruption into the oral cavity at approximately 1 year of age.Given this known duration of tooth development, these PMCI teeth provide an enamel record of exposures during pregnancy, at birth, and during postnatal that may impact the presence of localized developmental defects of the tooth enamel.The three major defects {enamel hypoplasia (EH), opacity (OP), and post-eruptive breakdown (PEB)}, which are illustrated in panels a, b, and c of Fig. 1, and dental caries (DC), treated as an enamel quantitative defect and shown in panel d of Fig. 1, were selected to compare and contrast enamel defect etiological predictors.Furthermore, the PMCI teeth are separated into three nonoverlapping horizontal regions (cervical, middle, and incisal) based on the general sequence of tooth development, which is shown in Fig. 2.
Our methods are motivated by our data to formulate a model that can handle examining how an individual defect is correlated across the regions of the PMCI teeth and examining the joint effect of the defects within a single region of the PMCI teeth.The motivating data were focused on assessing a model of mother and child factors present during pregnancy through delivery and early infancy indicative of the development of localized defects in the PMCI teeth.These data were obtained from maternal longitudinal data from a randomized, controlled trial (RCT) of maternal prenatal vitamin D 3 supplementation [7], a follow-up study of the children [8], and the children's dental imaging data obtained at 2-5 years of age [9].Maternal data were collected monthly from the 12th gestational week of pregnancy through delivery.Maternal predictors included: mother's age; pre-pregnancy body mass index (BMI); number of months during pregnancy that antacids were taken; serum circulating concentrations at 12, 28, and 36 weeks of gestation of calcium (Ca), phosphorus (P), 25-hydroxyvitamin D (25OHD); and intact parathyroid hormone (PTH).The levels of OHD and PTH were used to create the functional vitamin D deficiency (FVDD) ratio (OHD / PTH) at 12, 28, and 36 weeks of gestation.These predictors have a natural correlation to one another; however, for the purposes of this study, we leave examination of their association at a single time point for future work.
The children's data were collected from birth through 4-6 weeks postnatal and at the time of their dental imaging visit once the child reached at least 2 years of age.Child predictors collected at birth through 4-6 weeks' postnatal included: gestational age; early infancy diet (determined by whether the child had received formula by 4-6 weeks of age or was exclusively fed breastmilk); cord blood serum circulating Ca, P, 25OHD, PTH, and 1,25-dihydroxyvitamin D (1,25OH2D); and vitamin D binding protein (VDBP) genotype (focusing on 1s, 1f, and 2 genotypes).Cord blood 25OHD and PTH levels were used to create the FVDD at birth.Child predictors collected during the child's dental imaging visit specific for dental caries include: the age of the child at the time of the imaging, whether the child had ever visited the dentist (DDS), whether fluoride varnish had been put on the child's teeth (FLTX), child's sex, and child's salivary strep mutans count.The primary outcomes, EH, OP, PEB, and DC, were binary for the presence of a defect on the facial surfaces of the PMCI teeth in the cervical, middle, or incisal regions based on digital images from a digital camera (Nikon D90 SLR; Nikon Inc., Melville, NY) fitted with a ring flash and 105-mm macro lens with settings at f/32 aperture, 1/60 shutter speed, and a 3 × magnification.
Our analysis of the dental defect data is based on a Bayesian paradigm.Prior research in oral health that has utilized a Bayesian modeling approach varies depending on the research goal.Komarek et al. [10] implemented a modified version of the intensity model of Harkanen et al. [11] to examine how fluoride-intake affects the time to caries development for the permanent first molars.Bandyopadhyay et al. developed a random effect autologistic Bayesian regression model to assess the effects of exposures on a subject's caries experience, [12] and multivariate spatial beta-binomial model count data that accounts for spatial associations in dental caries research [13].Mutsvari et al. extended the multilevel autologistic Bayesian model to capture the probability of misclassification of caries presence on a surface of a tooth from a function of covariates, [14] and Jin et al. further extended the model to utilize a two-level Bayesian hierarchical model under spatial Markov random field assumptions to mitigate how the presence of dental caries at the tooth and surface level can be mixed complicating any analysis of the data under a unified framework.[15] In our modeling we assess the linkages between tooth region development and both the type of defects and the association with predictors that could be influential at different times during tooth development.In what follows we first develop our Bayesian methodology and propose different model formulations.After that we consider a simulation study that demonstrates the abilities of the approach in providing an appropriate modeling paradigm for the analysis of multiple defect occurrence on tooth regions.Finally, we provide a case study where we apply the methodology to defect and dental caries data scored from photographic images taken from a sample of children's teeth [9].

Methodology
We use Bayesian hierarchical models (BHMs) to assess the relation between defect presence and a range of relevant predictors [16].This approach provides a flexible way to examine the different and complex relations between defects and tooth regions.We also employ special Bayesian variable selection methods (Gibbs variable selection: GVS) to examine all model combinations of predictors [17].
It is comparatively difficult to set up and fit non-Bayesian models when joint occurrences are to be examined or to allow for sophisticated model selection.In addition, BHMs allow the use of random effects so that extra noise in the outcome can be accommodated.In what follows we address the situation where we observe different compositions of defects within different tooth regions.

Bayesian Model-K th Defect Across Regions
Assume we have K defects and these are observed on surfaces of the child's two front PMCI that are further divided into three regions: cervical, middle, and incisal.We assume that the observed presence of defect, Y ijk , for the i th subject ( i = 1, ..., n ) in the j th tooth region ( j = 1, 2, 3 ), and k th defect follows a Bernoulli distribu- tion as where π ijk is the probability of the k th defect in either of tooth E or F (notation of the Universal Numbering [18]) for the i th subject in the j th region.Furthermore, we denote z ik to represent the vector of logits or log-odds across the regions and assume it follows a multivariate normal distribution as where µ ik represents the mean vector of the linear struc- ture of predictors and Σ k is the 3 × 3 covariance matrix capturing the correlation between regions for a given defect.Notably, µ ik is shown as where X i = 1, x i1 , x i2 , ..., x ip T represents a (p + 1) × 1 vector of p possible predictors, and β k = β 0k , β 1k , β 2k , ..., β pk T denotes a (p + 1) × 3 matrix of fixed effects for each j th region.For each k th defect model, the p th fixed effect from the j th region is assumed to follow a normal distribution, β pjk ∼ N (0, σ 2 β ) , with a hyperprior for σ 2 β that follows a gamma distribution, σ 2 β ∼ Gamma(2, 0.5) .In addition, to capture the additional variation due to the regional clustering of the outcome variables, b ik is included as a linear term in the mean vector and is assumed to be independent with zero mean and follows a normal distribution prior, b ik ∼ N (0, σ 2 bk ) , with a hyperprior for σ 2 bk that follows a gamma distribution, σ 2 bk ∼ Gamma(2, 0.5) .This allows Σ k to account for the residual variability in the outcome variables not explained by the fixed and random effects.

Bayesian Model -Multivariate Joint Model for K Defects
Similar to the individual model across regions, we assume a joint model with K defects observed on facial surface of the child subject's two PMCI; however, this model is set within only one region.In our example, we examine a multivariate joint model where j = 3 for the incisal region.We assume the observed presence of defect follows a Bernoulli , where π ijk is the prob- ability of the k th defect in either of tooth E or F for the i th subject in a set j th region.However, now we allow z ij to rep- resent the vector of logits or log-odds of the joint defects and assume it follows a multivariate normal distribution as where µ ij represents the mean vector of the linear struc- ture of predictors and Σ j is the 4 × 4 covariance matrix capturing the correlation between defects for a given region.Furthermore, µ ij is shown as where vector of p possible predictors, and β j = β 0j , β 1j , β 2j , ..., β pj T denotes a (p + 1) × 4 matrix of fixed effects for each k th defect.For each multivariate joint model in a j th region, the p th fixed effect for the k th defect follows a normal dis- tribution, β pjk ∼ N (0, σ 2 β ) , with a hyperprior for σ 2 β that follows a gamma distribution, σ 2 β ∼ Gamma(2, 0.5) .To capture the added variation due to the clustering of the outcome variables within a subject, b ij was included as a linear term in the mean vector and assumed to be independent with zero mean following a normal distribution prior, b ij ∼ N (0, σ 2 bj ) , with a hyperprior for σ 2 bj that follows a gamma distribution, σ 2 bj ∼ Gamma(2, 0.5) .Σ j then accounts for the residual variability in the outcome variables not explained by the fixed and random effects.

Covariance Structure
The size of the correlation structure depends on the whether we are implementing our model for an individual k th defect (which includes Σ k within the model) or a mul- tivariate joint model in the j th region (which includes Σ j ).Using the Cholesky decomposition of the Σ k or Σ j prior into their respective scale and correlation matrix yields where Ω k and Ω j are correlation matrices and τ k as well as τ * j are coefficient scales, respectively.Both Ω k and Ω j each individually assume a LKJ prior distribution in their respective models where Ω k ∼ LKJ (η k ) and Ω j ∼ LKJ η j .η k and η j control how certain the prior is of large correlations between the regions within their respective models.If η k = 1 or η j = 1 , then the density is uniform over the correlation matrix of a given order suggesting uncertainty of whether the regions are correlated.
then extreme correlations are less likely, and if 0 < η k < 1 or 0 < η j < 1 , then correlations between regions are favored though both positive and negative correlations are equally plausible [19].
The LKJ distribution is an extension of the D-vine method [20].The D-vine method uniformly generates random correlation matrices over the space of all positive definite correlation matrices using an appropriate transformation of partial correlation that are then assigned to edges of a regular vine.The LKJ distribution, also referred to as the C-vine method, applies the assignments of partial correlations on the vine to the edges only on those partial correlations that have already been specified on the vine resulting in higher computational efficiency [21].
An alternative to the LKJ distribution could have been to allow Σ k or Σ j to follow from the Wishart or inverse- Wishart distributions.These natural conjugate priors are common choices for a covariance matrix.However, these distributions are lacking flexibility to allow a wider range of uncertainty for variance parameters since they are constrained by a single degree of freedom [22].In addition, since the marginal distribution for the variance of an inverse-Wishart is an inverse gamma distribution, datasets in which low variances are plausible can yield sensitive and biased inferences [23].Lastly, there is a natural dependency between correlations and variances in the inverse-Wishart prior such that small variances are associated with correlations near zero and large variances are associated with correlations near one [24].
Additionally, within an individual defect models, another alternative could have been to allow Σ k to fol- low a spatial correlation structure in an individual defect model to examine the correlation between tooth regions.However, our goal was to implement a flexible model that could examine the correlation between regions, between defects, and in a nested model between both regions and defects.Thus, the correlation chosen for our model measures the pair-wise correlation between regions or between defects and not any spatial correlation within the PMCI teeth.Due to these reasons and the uncertainty of whether the regions are correlated, we assumed a LKJ prior for each k th defect such that Ω k ∼ LKJ (1) in the individual defects models and a LKJ prior for each multivariate joint model in a j th region such that Ω j ∼ LKJ (1) .Furthermore, we assumed weakly informative priors on the scales, τ k ∼ Cauchy(0, 2.5) and τ * j ∼ Cauchy(0, 2.5) , which when combined with Ω k and Ω j formed both Σ k and Σ j respectively.

Bayesian Variable Selection
To evaluate the possible number of alternative linear combinations of predictors, we employ a Bayesian variable selection method known as Gibbs variable selection.While other Bayesian variable selection approaches are available, they are beyond the scope of this paper.Let us examine how the variable selection method would work under an individual defects model for a k th defect across the regions.This would be similar for a multivariate joint model in a j th region except we would exchange subscripts accordingly.
We use an auxiliary indicator variable I pj (where I pj = 1 indicates the presence of the p th predictor in the j th region while I pj = 0 indicates the absence).Under this process, the mean vector of the linear structure of predictors is shown as where each element in the (p + 1) × 3 matrix of ξ k is determined by ξ pj = β pj I pj .Each indicator variable assumes a Bernoulli distribution, I pj ∼ Bernoulli ψ pj , with a hyperprior for ψ pj that follows a Beta distribution,

Missingness (in Application)
Within our application, we assume the data to be missing at random (MAR) and address any missingness during model fitting.Since we fit our model for applications with non-missing analyses, our only missingness was encountered in our predictors.To handle this missingness, we assume that the predictors are a realization from a prior distribution.Any missing values in the predictors are then imputed as parameters and iteratively updated within our Bayesian computational approach.
For instance, let us examine how missingness would be addressed in an individual defect model across regions.We assumed continuous distributions to be normally distributed with a mean of the predictor's non-missing values and the variance, σ 2 jk , to have a Gamma hyperprior where σ 2 jk ∼ Gamma(2, 0.5) .Binary predictors were assumed to follow a Bernoulli distribution with probability, ν jk , with a hyperprior for the prob- ability to be ν jk ∼ Uniform(0, 1) .Last, for the one categorical predictor having three categories, VDBP, we imputed missing values using multinomial regression with maternal race as the predictor included in the model.The categories of maternal race included African American, Caucasian, and Hispanic.This decision was based on prior literature regarding the association between maternal race/ethnicity and genotype towards the type of the child's VDBP.
Each fixed effect in the linear function follows a normal distribution, ζ p * j ∼ N (0, σ 2 ζ ) with each having a hyperprior for σ 2 ζ that follows a Gamma distribution, σ 2 ζ ∼ Gamma(2, 0.5) .The p * denotes whether the fixed effect is the intercept, p * = 0 , or the fixed effect for an individual's maternal race, p * = 1.

Model Computation
The simulation and application of our BHMs were both implemented with the R language [25] using MCMC simulations via the NIMBLE package [26].This package is based on parsed versions of BUGS code; however, it extends the BUGS language for writing new functions or distributions yielding increased flexibility of model specification.Further, it compiles models using its own C + + samplers increasing computational efficiency.All simulation and application models fit were run to convergence and confirmed using the Gelman-Rubin diagnostic.

Case Studies -Application
We applied our methodology to dental defect data scored from photographic images made from a sample of children's teeth to analyze the association of the presence of defects with potential predictors.We implemented our model for two scenarios: an individual defect model for the EH outcome to identify significantly associated predictors accounting for the correlation across the regions (cervical, middle, and incisal) of the child's teeth; a multivariate joint model to identify predictors with a singular or joint association with defects (EH, OP, PEB, and DC) in the incisal region.

Individual Enamel Hypoplasia Model
We fit a model across the three regions to examine how predictors are associated with EH.Though there were 161 total observations in the study sample, we only included observations with non-missing EH outcome measures.The sample size of 148 across the three regions for the model is shown in Table 1.
With the goal of distinguishing biomarkers associated with the presence of an EH defect in any of the cervical, middle, or incisal regions for the primary maxillary central incisors, we assumed to only have 1 defect where k = 1 for the EH defect.Thus, for the i th subject ( i = 1, ..., n ) in the j th tooth region ( j = 1, 2, 3 ), and k = 1 defect follows a Bernoulli distribution as where π ij1 is the probability of the EH defect in either PMCI tooth for the i th subject in the j th region.z i1 rep- resents the vector of log-odds across the regions and is assumed to follow a multivariate normal distribution as where µ i1 represents the mean vector of the linear struc- ture of predictors and Σ 1 as the 3 × 3 covariance matrix capturing the correlation between regions for the EH defect.
We included all potential predictors except for those predictors specific to dental caries: DDS, FLTX, and child's strep mutans count.We fit a full model to conduct Gibbs Variable Selection identifying those predictors with high posterior probabilities of inclusion in at least one region.This model indicated the child's gestational age (incisal region), mother's P at 36 weeks' gestation (cervical region), mother's FVDD ratio at 12 weeks' gestation (cervical region), mother's FVDD ratio at 28 weeks' gestation (cervical region), and child's age at dental imaging visit (cervical and incisal regions) were the only predictors to yield posterior probabilities of inclusion greater than 0.6 in those regions listed.We considered the use of the 0.5 threshold often proposed for Gibbs variable selection [27].However, we needed to balance parsimony and explanatory power, and upon examination, we found the 0.6 threshold provided a better discriminatory performance.Using these predictors, we fit a final reduced model across all three regions.
Table 2 details the posterior parameter means on the odds scale and their 95% credible intervals for the reduced model fit across all three regions.Significant associations include the child's gestational age (incisal region), mother's FVDD ratio at 12 weeks (cervical region), and child's age at visit (cervical and incisal regions).Interpretations for each predictor are similar pending on the predictor being examined.For gestational age, for example, the interpretation is as follows: Holding all other covariates constant, we expect to see an approximate 4% statistically non-significant higher odds of EH in the cervical region, an approximate 11% statistically nonsignificant lower odds of EH in the middle region, and an approximate 22% statistically significant lower odds of EH in the incisal region for a one week increase in gestational age.Furthermore, we observed a consistent directionality of the odds across each region for most predictors with the exceptions of mother's FVDD ratio at 28 weeks' gestation and child's age at dental imaging visit, which have higher odds in the cervical and incisal regions relative to the middle region.Figure 3 depicts the directionality trend in odds across regions.There is a decreasing trend in odds across regions (cervical, middle, incisal) within predictors of the child's gestational age and mother's P at 36 weeks' gestation.Conversely, there is an increasing trend in odds across regions within the predictor of mother's FVDD ratio at 12 weeks.Additionally, while both predictors for mother's FVDD ratio at 12 weeks and child's age at imaging were only statistically significant in the cervical region, the bounds of their respective credible intervals were nearly significant.
Moreover, we examined the posterior correlation between regions shown in Table 3.The diagonal of the correlation matrix between effects degrades due to the LKJ prior distribution where each column indicates the degradation occurring; however, a simple interpretation can be made in the first column as measures indicate a one-to-one correlation between regions.These results reflect a non-statistically significant correlation between the cervical region and either the middle or incisal regions, and although the degradation is present, one can intuitively notice that the correlation between the middle and incisal regions is also not statistically significant.

Multivariate Defects Model
We fit a joint model at the incisal region to examine how predictors are associated under a joint effect for all defects (EH, OP, PEB, and DC).Any observation with the presence of at least one known defect in their PMCI regardless of whether that information was missing for any other defect was included in the model, as summarized in Table 4 below.
We assumed a joint model with 4 defects where k = 1, 2, 3, 4 that represent the defects as 1 (EH), 2 (OP), 3 (PEB), and 4 (DC).Thus, for the i th subject ( i = 1, ..., n ) in the j = 3 incisal region with the k th defect, our model follows a Bernoulli distribution as where π i3k is the probability of the joint presence of a defect in any PMCI for the i th subject in the incisal region.z i3 represents the vector of logits of the joint defects and we assumed it follows a multivariate normal distribution as where µ i3 represents the mean vector of the linear struc- ture of predictors in the incisal region and Σ 3 as the 3 × 3 covariance matrix capturing the correlation between defects in the incisal region.
With dental caries as an outcome in the joint model, the potential predictors from the imaging visit were also included.We fit a full model to conduct Gibbs Variable Selection identifying those predictors with high posterior probabilities of inclusion for at least one defect.This model indicated that the child's gestational age (EH and OP), whether the child was ever on formula (OP and DC), mother's BMI (PEB), mother's Ca at 28 weeks' gestation (EH), mother's P at 28 weeks' gestation (PEB), mother's FVDD ratio at 36 weeks' gestation (OP), child's age at dental imaging visit (EH and PEB), and child's sex (OP) were the only predictors to yield posterior probabilities of inclusion greater than 0.6 in those regions listed.Similar to the individual model, we found that the 0.6 threshold yielded an improved discriminatory performance relative to the 0.5 threshold typically used for Gibbs variable selection [27].Using these predictors, we fit a final reduced joint model for all defects within the incisal region.
Table 5 details the posterior parameter means on the odds scale and their 95% credible intervals for the reduced joint model of all defects.Significant associations include the child's gestational age (EH and OP), mother's FVDD ratio at 36 weeks' gestation (OP and DC) and child's age at dental imaging visit (PEB).Each predictor's interpretation is similar, and taking the child's gestational age as an example, the interpretation is as follows: Holding all other covariates constant, we expect to see an approximate 22% statistically significant   lower odds of EH, an approximate two-fold statistically significant higher odds of OP, an approximate 18% statistically non-significant lower odds of PEB, and an approximate 3% statistically non-significant lower odds of DC in the incisal region for a one week increase in gestational age.The joint relationship between posterior parameter means across the defects is shown in Fig. 4 below.Of note, child's gestational age displays a statistically significant higher odds of OP holding all other predictors constant whereas its association with the remaining outcomes has statistically non-significant lower odds.Conversely, child's age at time of the dental imaging visit indicates higher odds of any defect when holding all other predictors constant, although it is only a statistically significant higher odds of PEB.Furthermore, we examined the posterior correlation between defects shown in Table 6.The diagonal of the correlation matrix between effects degrades due to the LKJ prior distribution where each column indicates the degradation occurring; however, a simple interpretation can be made in the first column as measures indicate a one-to-one correlation between defects.These means indicate a near statistically significant positive correlation between EH and PEB.

Analysis -Simulation Study
A simulation study was implemented to evaluate the ability of our model to consistently identify predictors associated with the presence of a defect in any region and how accurate the obtained posterior predictor estimates relate to the predictor estimates used in the data generation.Random datasets were generated for each iteration using estimates obtained from the original application model.The summary statistics for each defect in the application model and the assumed distributions for each predictor are shown in Table S1 in the supplementary section.
Due to the complexity of the BHM, we constructed three different simulation scenarios based on the individual defect model for EH, OP, PEB, and DC.The first simulation scenario re-fit the original application model for a simulated set of predictors and outcomes given the posterior means of the parameters and the covariance structure from the original application model.The parameter estimates on the log-odds scale (shown in Table 7 for EH) and the covariance structure (shown in Table 8 for EH) are under the Scenario 1 section.The purpose of this scenario was to examine the consistency of our model identifying  predictors with significant inclusion and the predictors' posterior means.The second simulation scenario assumed that the regions were independent of one another (under Scenario 3 in Table 7 for EH) with the purpose of whether our model could identify a different covariance structure relative to the original application.The third simulation scenario assumed only a select number of predictors had a log-odds magnitude greater than 0. The number of non-zero parameter estimates was based on the number of predictors included in the reduced models for each individual defect in the original application.The magnitudes shown in Tables 7 and 8 are based on the individual defect model for enamel hypoplasia, and the other defect models are shown in Tables S2-S6 in the Supplementary section.We fit 50 iterations for each individual model over the three scenarios collecting 2,000 samples after a burn-in of 400,000.We display the results for the simulations for the EH individual defect model; however, we provide additional results for the other defects in the Supplementary section.

Scenario 1: Re-fitting Original Application
Under this scenario, we compared the range of the posterior probabilities of inclusion for each of the predictors obtained from the simulation with their posterior probability of inclusion from the original application model, which is shown below in Fig. 5 (EH) and Figures S3 (OP), S7 (PEB) and S11 (DC) found in the Supplementary section.We observed that nearly all predictors that had high posterior probabilities of inclusion greater than 0.85 in the original application model had inclusion ranges that either included or were exclusively greater than the 0.6 threshold.One exception was in models where the child strep mutans count predictor was originally identified as inclusive in the original application model; however, the range of posterior probabilities of inclusion had difficulty identifying this predictor as inclusive with ranges less than the 0.6 threshold.This difference between the simulations and application was likely caused by the original application's dataset having an upper limit of three strep mutans count; however, our generated data did not truncate at that upper limit, resulting in more variability between the original and simulated data.Our results also showed that predictors that had posterior probabilities of inclusion between 0.6 and 0.85 in the original application model had ranges that varied across both the 0.5 and 0.6 thresholds.More notably, however, non-inclusive predictors from the original application model had posterior probability of inclusion ranges less than 0.6 indicating consistency in not picking up non-inclusive predictors for our reduced models.
Given a predictor returning a posterior probability of inclusion greater than the 0.6 threshold at any iteration of the simulation, the predictor would be used to fit a reduced model while non-inclusive predictors would be removed.Under this premise, we obtained the posterior parameter means on the odds scale for each predictor in a reduced model setting given that they were an inclusive predictor at that iteration.We observed that inclusive predictors had posterior parameter mean ranges had a more noticeable separation from the null odds the higher the posterior probability of inclusion.Furthermore, non-inclusive predictors had posterior parameter mean ranges closely centered around the null odds.These are shown in the Supplementary section in Figures S1 (EH), S4 (OP), S8 (PEB), and S12 (DC).
Finally, we analyzed the error between the posterior parameter means at each iteration (given that they were included in the original application model) and their posterior parameter means from the original application Fig. 5 Displays the median and range of the posterior probabilities of inclusion across the 50 iterations of the simulation for enamel hypoplasia.The horizontal red dashed line denotes the stricter threshold of inclusion (set at 0.6) based on the large number of predictors in our model.The horizontal blue dashed line denotes the original threshold of inclusion at 0.5.The red points shown in each box plot are the full models' posterior probability of inclusion from the original application for that predictor in that region model.The models performed well with small errors for nearly all predictors.The exceptions occurred when the child's strep mutans count predictor was an inclusive predictor in the original application model, which we noted prior as to why that was likely the case.We display the mean-squared error (MSE) results for EH in Fig. 6  The main goal of this scenario, however, was to examine how well our model adjusted to a different correlation structure.Under the LKJ covariance structure, we obtained the error between our assumption of independent regions and the posterior means of covariance.For the EH model, we can note that there are small differences in the assumed covariance structure and the posterior means we returned from our model.The MSE difference between the two structures was small as evidenced by the results shown below in Fig. 7

Scenario 3: Adjusting Posterior Means
Under this scenario, our main goal was to examine how well our model was able to identify predictors that we generated to have non-zero magnitudes of varying strength as inclusive as well as determine their appropriate magnitudes.The range of the posterior probabilities of inclusion for each of the predictors obtained Fig. 6 MSE in posterior parameter estimates given inclusion (EH).Description: Depicts the mean-squared error between the original application's posterior parameter means and the posterior means obtained from the iterations of the simulation given their inclusion for enamel hypoplasia from the simulation was compared with the set posterior means from Table 7, and these results are shown below in Fig. 8 (EH) as well as Figures S29 (OP), S33 (PEB), and S37 (DC) in the Supplementary section.In general, nonzero magnitude predictors had more variable posterior probabilities of inclusion ranges that were also higher than null predictors.Notably, the null predictors' posterior probability of inclusion ranges were at or below the 0.5 threshold, indicating that the model did not inflate the association between null predictors and the defect.One exception to these results was the mother's P at 28 weeks, which consistently had lower posterior probabilities of inclusion.This could be due to the natural correlation it shares with the other maternal P predictors at 12 and 36 weeks resulting in its true inclusion being diminished.
Conditioning on a predictor being inclusive in the iteration of the simulation, we obtained the posterior parameter means on the odds scale for each predictor in a reduced model.Inclusive predictors had posterior parameter mean ranges that had distinct separation from the null odds with evidence that the more inclusive a predictor is in the model, the greater the separation from the null odds.Furthermore, non-inclusive predictors had posterior parameter mean ranges closely centered around the null odds.These are shown in the Supplementary section in Figures S26 (EH), S30 (OP), S34 (PEB), and S38 (DC).
Unlike the other scenarios, we observed more inclusive predictors having greater error between the set posterior parameter means and the posterior parameter means obtained from the simulations.While the child's strep mutans count predictor was problematic throughout all simulations, the mother's age predictor had an increased amount of error between the posterior parameter means relative to the other scenarios.Ultimately, the models performed well with small errors for the other predictors.We display these results in the Supplementary section in Figure S27

Discussion
The underlying theme of the Bayesian model we proposed is to develop a model for spatially multivariate dental defects and dental caries assuming a covariance structure that handles spatial and joint effects.Our goal for this theme is that the model proposed is consistent with alternative models that assume different covariance structures, which avoids needing to use separate models under more rigid covariance structures to model the same datasets under different scopes.By consistent, we mean that inferences made from models remain steady in identifying the appropriate posterior probabilities of inclusion and posterior covariance means under differing scenarios.
Our approach is different from other multivariate or spatial models, which use alternative covariance structures to identify the correlation between joint or spatial effects.Those assumed covariance structures are powerful and commonly used, but they are limited in their rigidness to account for wider uncertainty for variance parameters.An advantage of our approach is to implement a correlation structure with increased flexibility to allow for wider uncertainty for variance parameters while including a tuning parameter that provides greater information to control how certain our prior is of large correlations between regions or joint effects.Ideally, our approach will be extended further to investigate the correlation between both regions and joint effects within the same model.
In the simulation scenarios explored in this paper, our model was consistent in identifying the posterior probabilities of inclusion relative to the results from the original application model and when adjusting the posterior parameter means.Predictors with posterior probabilities of inclusion greater than 0.85 in the original application model had a range of iterations above the 0.6 strict threshold for inclusion under Gibbs variable selection.Furthermore, predictors with posterior probabilities of inclusion between 0.6 and 0.85 in the original application model had posterior inclusion ranges that spanned the 0.5 and 0.6 thresholds.One exception was the child strep mutans count predictor, which could be explained by not truncating the upper limit of the predictor when generating data under the simulation scenario.Additionally, our model adjusted to independent correlations between regions under similar iterations of our generated data.

Conclusion
We proposed and evaluated a set of Bayesian hierarchical models to address the appearance of different combinations of defects across different tooth regions.In our modeling we assess the linkages between tooth region development and both the type of defect and associations with etiological predictors of the defects which could be influential at different times during the tooth crown development.We use BHMs to assess the relation between defect presence and a range of relevant predictors and employ GVS to examine all model combinations of predictors to select a subset of relevant predictors.We assumed a LKJ prior for each model to allow for a wider range of uncertainty in variance parameters and to yield a natural extension to a nested hierarchical model.
The developments in this paper shed new light on methods for capturing inclusive predictors in multivariate joint models or spatial models under the same covariance structure.Both the models fit for an individual developmental enamel defect accounting of the correlation between regions and the joint model for the multivariate developmental enamel defects within a single region yielded inference on subsets of etiological biomarkers associated with the respective outcomes in respective regions.Moreover, these models indicated the correlation between defects and regions in their respective models.The proposed model provides a natural extension to expanding the covariance structure to account for a region by joint defects covariance structure.

Fig. 1
Fig. 1 Panels displaying type of localized defect or dental caries in PMCI teeth as indicated by the yellow arrows.Panel (a) identifies the presence of enamel hypoplasia (EH) indicated by a lesser amount of enamel.Panel (b) shows the presence of post-eruptive breakdown (PEB) identified by the enamel wearing away.Panel (c) displays the presence of opacity (OP) identified by a whiter (or yellower) enamel color.Panel (d) shows the presence of dental caries (DC) identified by the spot of decay in the enamel

Fig. 3
Fig. 3 Enamel hypoplasia profile plot of predictors' 95% credible intervals on the odds scale for those predictors included in the reduced enamel hypoplasia model.The red dashed line denotes the null value of 1.0 on the odds scale

Fig. 4
Fig. 4 Incisal region profile plot of predictors' 95% credible intervals on the odds scale for those predictors included in the reduced joint model of all defects in the incisal region.The red dashed line denotes the null value of 1.0 on the odds scale below to evaluate the goodness.Additional results for MSE and mean absolute error (MAE) are shown in the Supplementary section in Figure S2 (EH MAE), Figure S5 (OP MSE), Figure S6 (OP MAE), Figure S9 (PEB MSE), Figure S10 (PEB MAE), Figure S13 (DC MSE), and Figure S14 (DC MAE).The choice to use MSE and MAE over DIC and WAIC was made to examine the differences between the predicted posterior parameter means from the simulations, and we leave the alternative measures for model selection.Scenario 2: Independent Regions Since there were not enough outcomes of interest in two regions for the original application of the individual PEB model, we ran this simulation scenario for only three individual defect models (EH, OP, and DC).While our goal was to examine the performance of our model to pick up the covariance structure under a different structure relative to the original application model, we also obtained results for how well the models identified posterior probabilities of inclusion and posterior parameter means.Our inferences were similar to what we returned in Scenario 1 for each predictor.The results for the range of posterior probabilities of inclusion are shown in Figures S15 (EH), S18 (OP) and S22 (DC) in the Supplementary section.The results for the posterior parameter means given that the predictors had posterior probabilities of inclusion greater than 0.6 are shown in Figures S16 (EH), S19 (OP) and S23 (DC) in the Supplementary section.

Fig. 7 Fig. 8
Fig. 7 Details the mean-squared error between the original application's posterior means of the covariance structure for the LKJ correlation and the posterior means of the covariance structure obtained from the iterations of the simulation for enamel hypoplasia (EH MSE), Figure S28 (EH MAE), Figure S31 (OP MSE), Figure S32 (OP MAE), Figure S35 (PEB MSE), Figure S36 (PEB MAE), Figure S39 (DC MSE), and Figure S40 (DC MAE).

Table 1
Subjects per defect and region

Table 2
Enamel hypoplasia parameter estimates

Table 4
Subjects per defect in incisal region

Table 5
Parameter estimates (all defects joint model)

Table 7
Parameter magnitudes set for each simulation scenario (EH)Description: Parameter magnitudes used for Scenarios 1 and 2 are based on the posterior means of the parameters from original application model for the enamel hypoplasia defect.Parameter magnitudes for Scenario 3 are randomly assigned with the number of predictors chosen based on the number of predictors that met the threshold for inclusion in the original application model for enamel hypoplasia

Table 8
Magnitudes used for covariance structure (EH) Description: Estimates for the covariance matrix obtained for Scenarios 1 and 3 are based on the posterior means of the LKJ covariance structure from original application model for the enamel hypoplasia defect.Estimates for the covariance matrix for Scenario 2 are under the assumption that the regions are independent